Learning Classification Systems Maximizing the Area under the Roc Curve
Author
Abstract
Permission is herewith granted to Università degli Studi di Cassino to circulate and to have copied for non-commercial purposes, at its discretion, the above title upon the request of individuals or institutions.
Acknowledgements
This work would not have been possible without the support I received from many people. A big thank you to all who have helped me in one way or another to complete this thesis. The most important person I would like to thank is my supervisor, Francesco Tortorella, who has guided me through this work over these three years. Having him as a supervisor is a continuous source of motivation and gives you the feeling that you can do something useful. He is always ready to give advice when you need it and encourages you to pursue your individual interests. I learnt a lot from our discussions on ROC curves and life. Thanks!
I would also like to thank Prof. Bob Duin of TU Delft. I spent five beautiful months in the Netherlands, and his help was important in completing this thesis. My special thanks also go to all members of the LIT group, who ensured that working there was always fun. Thanks for all the support during our "numerous" daily coffee breaks! They are friends more than colleagues.
Finally, my family: what would I have been without them? Thanks for everything.
Summary
This thesis is concerned with supervised classification problems, in which the aim is to build a rule that assigns objects to one of a finite set of classes. Systems able to perform this task using a set of known examples are called classifiers. In particular, this work focuses on problems where we have to distinguish between two mutually exclusive classes. In this case, many distinct criteria can be used to compare the performance of rules.
This thesis analyses the Receiver Operating Characteristic (ROC) curve methodology in pattern recognition and proposes the use of the Area under the ROC Curve (AUC) as a performance measure for building dichotomizers and combination rules. The thesis is organized as follows:
• Chapter 1: we introduce the pattern recognition framework in which this work is placed. Starting from the basics of statistical pattern recognition, we introduce the main problems addressed in this thesis: two-class classification and, in this context, the combination of classifiers.
• Chapter 2: the …
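As an illustration of the AUC used as a performance measure for a two-class classifier, the following is a minimal sketch, not the method developed in the thesis. It relies on the standard equivalence between the empirical AUC and the Wilcoxon-Mann-Whitney statistic: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `auc_pairwise` and the sample data are hypothetical.

```python
from typing import Sequence


def auc_pairwise(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    labels use 1 for the positive class and 0 for the negative class.
    Tied scores contribute 0.5 to the count.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for six samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(auc_pairwise(scores, labels))  # 8 of the 9 pairs are ordered correctly
```

The double loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based computation gives the same value in O(n log n).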
Similar resources
Risk Estimation by Maximizing the Area under ROC Curve
Risks exist in many different domains; medical diagnoses, financial markets, fraud detection and insurance policies are some examples. Various risk measures and risk estimation systems have hitherto been proposed and this paper suggests a new risk estimation method. Risk estimation by maximizing the area under a receiver operating characteristics (ROC) curve (REMARC) defines risk estimation as ...
Lae-Jeong Park and Jung-Ho Moon: A Learning Method of Directly Optimizing Classifier Performance at Local Operating Range
This paper presents a learning method that directly optimizes a neural network classifier's discrimination performance in a desired local operating range by maximizing a partial area under a receiver operating characteristic (ROC) or domain-specific curve, which is difficult to achieve with learning methods based on classification accuracy or mean squared error (MSE). The ef...
Technical Report No: BU-CE-1001 A Discretization Method based on Maximizing the Area Under ROC Curve
We present a new discretization method based on the Area under the ROC Curve (AUC) measure. Maximum Area under ROC Curve Based Discretization (MAD) is a global, static and supervised discretization method. It discretizes a continuous feature so that the AUC based only on that feature is maximized. The proposed method is compared with alternative discretization methods such as Entropy-MDLP (...
Predicting The Type of Malaria Using Classification and Regression Decision Trees
Maryam Ashoori (School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran) and Fatemeh Hamzavi (School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran). Background: Malaria is an infectious disease infecting 200-300 million people annually. Environme...
Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation
This review presents the basic principles and rationale for ROC analysis of rating and continuous diagnostic test results versus a gold standard. Derived indexes of accuracy, in particular the area under the curve (AUC), have a meaningful interpretation for distinguishing diseased from healthy subjects. The methods of estimating the AUC and testing it in a single diagnostic test and also comparative studies...
Melanoma detection with a deep learning model
Background: Skin cancer is one of the most common forms of cancer in the world and melanoma is the deadliest type of skin cancer. Both melanoma and melanocytic nevi begin in melanocytes (cells that produce melanin). However, melanocytic nevi are benign whereas melanoma is malignant. This work proposes a deep learning model for classification of these two lesions. Methods: In this analytic s...